<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <title>Learning from Examples &raquo; k-means Clustering | Taskflow QuickStart</title>
  <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Source+Sans+Pro:400,400i,600,600i%7CSource+Code+Pro:400,400i,600" />
  <link rel="stylesheet" href="m-dark+documentation.compiled.css" />
  <link rel="icon" href="favicon.ico" type="image/vnd.microsoft.icon" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <meta name="theme-color" content="#22272e" />
</head>
<body>
<header><nav id="navigation">
  <div class="m-container">
    <div class="m-row">
      <span id="m-navbar-brand" class="m-col-t-8 m-col-m-none m-left-m">
        <a href="https://taskflow.github.io"><img src="taskflow_logo.png" alt="" />Taskflow</a> <span class="m-breadcrumb">|</span> <a href="index.html" class="m-thin">QuickStart</a>
      </span>
      <div class="m-col-t-4 m-hide-m m-text-right m-nopadr">
        <a href="#search" class="m-doc-search-icon" title="Search" onclick="return showSearch()"><svg style="height: 0.9rem;" viewBox="0 0 16 16">
          <path id="m-doc-search-icon-path" d="m6 0c-3.31 0-6 2.69-6 6 0 3.31 2.69 6 6 6 1.49 0 2.85-0.541 3.89-1.44-0.0164 0.338 0.147 0.759 0.5 1.15l3.22 3.79c0.552 0.614 1.45 0.665 2 0.115 0.55-0.55 0.499-1.45-0.115-2l-3.79-3.22c-0.392-0.353-0.812-0.515-1.15-0.5 0.895-1.05 1.44-2.41 1.44-3.89 0-3.31-2.69-6-6-6zm0 1.56a4.44 4.44 0 0 1 4.44 4.44 4.44 4.44 0 0 1-4.44 4.44 4.44 4.44 0 0 1-4.44-4.44 4.44 4.44 0 0 1 4.44-4.44z"/>
        </svg></a>
        <a id="m-navbar-show" href="#navigation" title="Show navigation"></a>
        <a id="m-navbar-hide" href="#" title="Hide navigation"></a>
      </div>
      <div id="m-navbar-collapse" class="m-col-t-12 m-show-m m-col-m-none m-right-m">
        <div class="m-row">
          <ol class="m-col-t-6 m-col-m-none">
            <li><a href="pages.html">Handbook</a></li>
            <li><a href="namespaces.html">Namespaces</a></li>
          </ol>
          <ol class="m-col-t-6 m-col-m-none" start="3">
            <li><a href="annotated.html">Classes</a></li>
            <li><a href="files.html">Files</a></li>
            <li class="m-show-m"><a href="#search" class="m-doc-search-icon" title="Search" onclick="return showSearch()"><svg style="height: 0.9rem;" viewBox="0 0 16 16">
              <use href="#m-doc-search-icon-path" />
            </svg></a></li>
          </ol>
        </div>
      </div>
    </div>
  </div>
</nav></header>
<main><article>
  <div class="m-container m-container-inflatable">
    <div class="m-row">
      <div class="m-col-l-10 m-push-l-1">
        <h1>
          <span class="m-breadcrumb"><a href="Examples.html">Learning from Examples</a> &raquo;</span>
          k-means Clustering
        </h1>
        <nav class="m-block m-default">
          <h3>Contents</h3>
          <ul>
            <li><a href="#KMeansProblemFormulation">Problem Formulation</a></li>
            <li><a href="#ParallelKMeansUsingCPUs">Parallel k-means using CPUs</a></li>
            <li><a href="#KMeansBenchmarking">Benchmarking</a></li>
          </ul>
        </nav>
<p>We study a fundamental clustering problem in unsupervised learning, <em>k-means clustering</em>. We begin with the problem formulation and then show how to write a parallel k-means algorithm.</p><section id="KMeansProblemFormulation"><h2><a href="#KMeansProblemFormulation">Problem Formulation</a></h2><p>k-means clustering uses <em>centroids</em>, k randomly initialized points in the data, and assigns every data point to its nearest centroid. After every point has been assigned, each centroid is moved to the average of the points assigned to it. The k-means algorithm proceeds in the following steps:</p><ul><li>Step 1: initialize k random centroids</li><li>Step 2: for every data point, find the nearest centroid (by L2 distance or another distance measure) and assign the point to it</li><li>Step 3: for every centroid, move the centroid to the average of the points assigned to that centroid</li><li>Step 4: go to Step 2 until convergence (no change in the last few iterations) or until the maximum number of iterations is reached</li></ul><p>The algorithm is illustrated as follows:</p><img class="m-image" src="kmeans_1.png" alt="Image" /><p>A sequential implementation of k-means is shown below:</p><pre class="m-code"><span class="c1">// sequential implementation of k-means on a CPU</span>
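<span class="w"></span>
<span class="c1">// NOTE: this listing calls L2() without showing its definition.</span>
<span class="c1">// The helper below is an assumed minimal sketch, not necessarily the original:</span>
<span class="c1">// it returns the squared Euclidean distance between (x1, y1) and (x2, y2).</span>
<span class="c1">// Skipping the square root does not change which centroid is nearest.</span>
<span class="kt">float</span><span class="w"> </span><span class="nf">L2</span><span class="p">(</span><span class="kt">float</span><span class="w"> </span><span class="n">x1</span><span class="p">,</span><span class="w"> </span><span class="kt">float</span><span class="w"> </span><span class="n">y1</span><span class="p">,</span><span class="w"> </span><span class="kt">float</span><span class="w"> </span><span class="n">x2</span><span class="p">,</span><span class="w"> </span><span class="kt">float</span><span class="w"> </span><span class="n">y2</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">  </span><span class="k">const</span><span class="w"> </span><span class="kt">float</span><span class="w"> </span><span class="n">dx</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x1</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">x2</span><span class="p">;</span><span class="w"></span>
<span class="w">  </span><span class="k">const</span><span class="w"> </span><span class="kt">float</span><span class="w"> </span><span class="n">dy</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">y1</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">y2</span><span class="p">;</span><span class="w"></span>
<span class="w">  </span><span class="k">return</span><span class="w"> </span><span class="n">dx</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">dx</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">dy</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">dy</span><span class="p">;</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
<span class="w"></span>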
<span class="c1">// N: number of points</span>
<span class="c1">// K: number of clusters</span>
<span class="c1">// M: number of iterations</span>
<span class="c1">// px/py: 2D point vector </span>
<span class="kt">void</span><span class="w"> </span><span class="nf">kmeans_seq</span><span class="p">(</span><span class="w"></span>
<span class="w">  </span><span class="kt">int</span><span class="w"> </span><span class="n">N</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">K</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">M</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;&amp;</span><span class="w"> </span><span class="n">px</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;&amp;</span><span class="w"> </span><span class="n">py</span><span class="w"></span>
<span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>

<span class="w">  </span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span><span class="w"> </span><span class="n">c</span><span class="p">(</span><span class="n">K</span><span class="p">);</span><span class="w"></span>
<span class="w">  </span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;</span><span class="w"> </span><span class="n">sx</span><span class="p">(</span><span class="n">K</span><span class="p">),</span><span class="w"> </span><span class="n">sy</span><span class="p">(</span><span class="n">K</span><span class="p">),</span><span class="w"> </span><span class="n">mx</span><span class="p">(</span><span class="n">K</span><span class="p">),</span><span class="w"> </span><span class="n">my</span><span class="p">(</span><span class="n">K</span><span class="p">);</span><span class="w"></span>

<span class="w">  </span><span class="c1">// initial centroids</span>
<span class="w">  </span><span class="n">std</span><span class="o">::</span><span class="n">copy_n</span><span class="p">(</span><span class="n">px</span><span class="p">.</span><span class="n">begin</span><span class="p">(),</span><span class="w"> </span><span class="n">K</span><span class="p">,</span><span class="w"> </span><span class="n">mx</span><span class="p">.</span><span class="n">begin</span><span class="p">());</span><span class="w"></span>
<span class="w">  </span><span class="n">std</span><span class="o">::</span><span class="n">copy_n</span><span class="p">(</span><span class="n">py</span><span class="p">.</span><span class="n">begin</span><span class="p">(),</span><span class="w"> </span><span class="n">K</span><span class="p">,</span><span class="w"> </span><span class="n">my</span><span class="p">.</span><span class="n">begin</span><span class="p">());</span><span class="w"></span>
<span class="w">  </span>
<span class="w">  </span><span class="c1">// k-means iteration</span>
<span class="w">  </span><span class="k">for</span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">m</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">m</span><span class="o">&lt;</span><span class="n">M</span><span class="p">;</span><span class="w"> </span><span class="n">m</span><span class="o">++</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>

<span class="w">    </span><span class="c1">// clear the storage</span>
<span class="w">    </span><span class="n">std</span><span class="o">::</span><span class="n">fill_n</span><span class="p">(</span><span class="n">sx</span><span class="p">.</span><span class="n">begin</span><span class="p">(),</span><span class="w"> </span><span class="n">K</span><span class="p">,</span><span class="w"> </span><span class="mf">0.0f</span><span class="p">);</span><span class="w"></span>
<span class="w">    </span><span class="n">std</span><span class="o">::</span><span class="n">fill_n</span><span class="p">(</span><span class="n">sy</span><span class="p">.</span><span class="n">begin</span><span class="p">(),</span><span class="w"> </span><span class="n">K</span><span class="p">,</span><span class="w"> </span><span class="mf">0.0f</span><span class="p">);</span><span class="w"></span>
<span class="w">    </span><span class="n">std</span><span class="o">::</span><span class="n">fill_n</span><span class="p">(</span><span class="n">c</span><span class="p">.</span><span class="n">begin</span><span class="p">(),</span><span class="w"> </span><span class="n">K</span><span class="p">,</span><span class="w"> </span><span class="mi">0</span><span class="p">);</span><span class="w"></span>

<span class="w">    </span><span class="c1">// find the best k (cluster id) for each point</span>
<span class="w">    </span><span class="k">for</span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">&lt;</span><span class="n">N</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">      </span><span class="kt">float</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">px</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w"></span>
<span class="w">      </span><span class="kt">float</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">py</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w"></span>
<span class="w">      </span><span class="kt">float</span><span class="w"> </span><span class="n">best_d</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">numeric_limits</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;::</span><span class="n">max</span><span class="p">();</span><span class="w"></span>
<span class="w">      </span><span class="kt">int</span><span class="w"> </span><span class="n">best_k</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
<span class="w">      </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="n">K</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">k</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">        </span><span class="k">const</span><span class="w"> </span><span class="kt">float</span><span class="w"> </span><span class="n">d</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">L2</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">,</span><span class="w"> </span><span class="n">mx</span><span class="p">[</span><span class="n">k</span><span class="p">],</span><span class="w"> </span><span class="n">my</span><span class="p">[</span><span class="n">k</span><span class="p">]);</span><span class="w"></span>
<span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="n">d</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="n">best_d</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">          </span><span class="n">best_d</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">d</span><span class="p">;</span><span class="w"></span>
<span class="w">          </span><span class="n">best_k</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">k</span><span class="p">;</span><span class="w"></span>
<span class="w">        </span><span class="p">}</span><span class="w"></span>
<span class="w">      </span><span class="p">}</span><span class="w"></span>
<span class="w">      </span><span class="n">sx</span><span class="p">[</span><span class="n">best_k</span><span class="p">]</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w"></span>
<span class="w">      </span><span class="n">sy</span><span class="p">[</span><span class="n">best_k</span><span class="p">]</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">y</span><span class="p">;</span><span class="w"></span>
<span class="w">      </span><span class="n">c</span><span class="w"> </span><span class="p">[</span><span class="n">best_k</span><span class="p">]</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"></span>
<span class="w">    </span><span class="p">}</span><span class="w"></span>

<span class="w">    </span><span class="c1">// update the centroid</span>
<span class="w">    </span><span class="k">for</span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">k</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">k</span><span class="o">&lt;</span><span class="n">K</span><span class="p">;</span><span class="w"> </span><span class="n">k</span><span class="o">++</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">      </span><span class="k">const</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">count</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">max</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="n">c</span><span class="p">[</span><span class="n">k</span><span class="p">]);</span><span class="w">  </span><span class="c1">// turn 0/0 into 0/1 to avoid division by zero</span>
<span class="w">      </span><span class="n">mx</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sx</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">count</span><span class="p">;</span><span class="w"></span>
<span class="w">      </span><span class="n">my</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sy</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">count</span><span class="p">;</span><span class="w"></span>
<span class="w">    </span><span class="p">}</span><span class="w"></span>
<span class="w">  </span><span class="p">}</span><span class="w"></span>

<span class="w">  </span><span class="c1">// print the k centroids found</span>
<span class="w">  </span><span class="k">for</span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">k</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">k</span><span class="o">&lt;</span><span class="n">K</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">k</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">    </span><span class="n">std</span><span class="o">::</span><span class="n">cout</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="s">&quot;centroid &quot;</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="s">&quot;: &quot;</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">setw</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="n">mx</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="sc">&#39; &#39;</span><span class="w"></span>
<span class="w">                                          </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">setw</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="n">my</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="sc">&#39;\n&#39;</span><span class="p">;</span><span class="w"></span>
<span class="w">  </span><span class="p">}</span><span class="w"></span>
<span class="p">}</span><span class="w"></span></pre></section><section id="ParallelKMeansUsingCPUs"><h2><a href="#ParallelKMeansUsingCPUs">Parallel k-means using CPUs</a></h2><p>The second step of the k-means algorithm, <em>assigning every point to the nearest centroid</em>, is highly parallelizable across individual points. We can create a <em>parallel-for</em> task to run the iterations in parallel.</p><pre class="m-code"><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span><span class="w"> </span><span class="n">best_ks</span><span class="p">(</span><span class="n">N</span><span class="p">);</span><span class="w">  </span><span class="c1">// nearest centroid of each point</span>

<span class="kt">unsigned</span><span class="w"> </span><span class="n">P</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">12</span><span class="p">;</span><span class="w">  </span><span class="c1">// 12 partitioned tasks</span>

<span class="c1">// update cluster</span>
<span class="n">taskflow</span><span class="p">.</span><span class="n">for_each_index</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="w"> </span><span class="n">N</span><span class="p">,</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="p">[</span><span class="o">&amp;</span><span class="p">](</span><span class="kt">int</span><span class="w"> </span><span class="n">i</span><span class="p">){</span><span class="w"></span>
<span class="w">  </span><span class="kt">float</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">px</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w"></span>
<span class="w">  </span><span class="kt">float</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">py</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w"></span>
<span class="w">  </span><span class="kt">float</span><span class="w"> </span><span class="n">best_d</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">numeric_limits</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;::</span><span class="n">max</span><span class="p">();</span><span class="w"></span>
<span class="w">  </span><span class="kt">int</span><span class="w"> </span><span class="n">best_k</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
<span class="w">  </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="n">K</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">k</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">    </span><span class="k">const</span><span class="w"> </span><span class="kt">float</span><span class="w"> </span><span class="n">d</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">L2</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">,</span><span class="w"> </span><span class="n">mx</span><span class="p">[</span><span class="n">k</span><span class="p">],</span><span class="w"> </span><span class="n">my</span><span class="p">[</span><span class="n">k</span><span class="p">]);</span><span class="w"></span>
<span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="n">d</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="n">best_d</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">      </span><span class="n">best_d</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">d</span><span class="p">;</span><span class="w"></span>
<span class="w">      </span><span class="n">best_k</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">k</span><span class="p">;</span><span class="w"></span>
<span class="w">    </span><span class="p">}</span><span class="w"></span>
<span class="w">  </span><span class="p">}</span><span class="w"></span>
<span class="w">  </span><span class="n">best_ks</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">best_k</span><span class="p">;</span><span class="w"></span>
<span class="p">});</span><span class="w"></span></pre><p>The third step, moving every centroid to the average of its assigned points, is also parallelizable across individual centroids. However, since k is typically small, a single task is sufficient for this update.</p><pre class="m-code"><span class="n">taskflow</span><span class="p">.</span><span class="n">emplace</span><span class="p">([</span><span class="o">&amp;</span><span class="p">](){</span><span class="w"></span>
<span class="w">  </span><span class="c1">// sum of points</span>
<span class="w">  </span><span class="k">for</span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">&lt;</span><span class="n">N</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">++</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">    </span><span class="n">sx</span><span class="p">[</span><span class="n">best_ks</span><span class="p">[</span><span class="n">i</span><span class="p">]]</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">px</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w"></span>
<span class="w">    </span><span class="n">sy</span><span class="p">[</span><span class="n">best_ks</span><span class="p">[</span><span class="n">i</span><span class="p">]]</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">py</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w"></span>
<span class="w">    </span><span class="n">c</span><span class="w"> </span><span class="p">[</span><span class="n">best_ks</span><span class="p">[</span><span class="n">i</span><span class="p">]]</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"></span>
<span class="w">  </span><span class="p">}</span><span class="w"></span>
<span class="w">  </span>
<span class="w">  </span><span class="c1">// average of points</span>
<span class="w">  </span><span class="k">for</span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">k</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">k</span><span class="o">&lt;</span><span class="n">K</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">k</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">    </span><span class="k">auto</span><span class="w"> </span><span class="n">count</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">max</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="n">c</span><span class="p">[</span><span class="n">k</span><span class="p">]);</span><span class="w">  </span><span class="c1">// turn 0/0 into 0/1 to avoid division by zero</span>
<span class="w">    </span><span class="n">mx</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sx</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">count</span><span class="p">;</span><span class="w"></span>
<span class="w">    </span><span class="n">my</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sy</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">count</span><span class="p">;</span><span class="w"></span>
<span class="w">  </span><span class="p">}</span><span class="w"></span>
<span class="p">});</span><span class="w"></span></pre><p>To run <code>M</code> iterations, we create a condition task that loops back to the second step of the algorithm <code>M</code> times. A return value of zero selects the first successor, which we will later connect to the task of the second step; otherwise, k-means completes.</p><pre class="m-code"><span class="n">taskflow</span><span class="p">.</span><span class="n">emplace</span><span class="p">([</span><span class="n">m</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span><span class="w"> </span><span class="n">M</span><span class="p">]()</span><span class="w"> </span><span class="k">mutable</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">  </span><span class="k">return</span><span class="w"> </span><span class="p">(</span><span class="n">m</span><span class="o">++</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="n">M</span><span class="p">)</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="o">:</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"></span>
<span class="p">});</span><span class="w"></span></pre><p>The complete code of CPU-parallel k-means is shown below. Here we use additional storage, <code>best_ks</code>, to record the nearest centroid of each point in the current iteration.</p><pre class="m-code"><span class="c1">// N: number of points</span>
<span class="c1">// K: number of clusters</span>
<span class="c1">// M: number of iterations</span>
<span class="c1">// px/py: 2D point vector </span>
<span class="kt">void</span><span class="w"> </span><span class="nf">kmeans_par</span><span class="p">(</span><span class="w"></span>
<span class="w">  </span><span class="kt">int</span><span class="w"> </span><span class="n">N</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">K</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">M</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;&amp;</span><span class="w"> </span><span class="n">px</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;&amp;</span><span class="w"> </span><span class="n">py</span><span class="w"></span>
<span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>

<span class="w">  </span><span class="kt">unsigned</span><span class="w"> </span><span class="n">P</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">12</span><span class="p">;</span><span class="w">  </span><span class="c1">// 12 partitions of the parallel-for graph</span>

<span class="w">  </span><span class="n">tf</span><span class="o">::</span><span class="n">Executor</span><span class="w"> </span><span class="n">executor</span><span class="p">;</span><span class="w"></span>
<span class="w">  </span><span class="n">tf</span><span class="o">::</span><span class="n">Taskflow</span><span class="w"> </span><span class="n">taskflow</span><span class="p">(</span><span class="s">&quot;K-Means&quot;</span><span class="p">);</span><span class="w"></span>

<span class="w">  </span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span><span class="w"> </span><span class="n">c</span><span class="p">(</span><span class="n">K</span><span class="p">),</span><span class="w"> </span><span class="n">best_ks</span><span class="p">(</span><span class="n">N</span><span class="p">);</span><span class="w"></span>
<span class="w">  </span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;</span><span class="w"> </span><span class="n">sx</span><span class="p">(</span><span class="n">K</span><span class="p">),</span><span class="w"> </span><span class="n">sy</span><span class="p">(</span><span class="n">K</span><span class="p">),</span><span class="w"> </span><span class="n">mx</span><span class="p">(</span><span class="n">K</span><span class="p">),</span><span class="w"> </span><span class="n">my</span><span class="p">(</span><span class="n">K</span><span class="p">);</span><span class="w"></span>

<span class="w">  </span><span class="c1">// initial centroids</span>
<span class="w">  </span><span class="n">tf</span><span class="o">::</span><span class="n">Task</span><span class="w"> </span><span class="n">init</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">taskflow</span><span class="p">.</span><span class="n">emplace</span><span class="p">([</span><span class="o">&amp;</span><span class="p">](){</span><span class="w"></span>
<span class="w">    </span><span class="k">for</span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">&lt;</span><span class="n">K</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">      </span><span class="n">mx</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">px</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w"></span>
<span class="w">      </span><span class="n">my</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">py</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w"></span>
<span class="w">    </span><span class="p">}</span><span class="w"></span>
<span class="w">  </span><span class="p">}).</span><span class="n">name</span><span class="p">(</span><span class="s">&quot;init&quot;</span><span class="p">);</span><span class="w"></span>

<span class="w">  </span><span class="c1">// clear the storage</span>
<span class="w">  </span><span class="n">tf</span><span class="o">::</span><span class="n">Task</span><span class="w"> </span><span class="n">clean_up</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">taskflow</span><span class="p">.</span><span class="n">emplace</span><span class="p">([</span><span class="o">&amp;</span><span class="p">](){</span><span class="w"></span>
<span class="w">    </span><span class="k">for</span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">k</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">k</span><span class="o">&lt;</span><span class="n">K</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">k</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">      </span><span class="n">sx</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mf">0.0f</span><span class="p">;</span><span class="w"></span>
<span class="w">      </span><span class="n">sy</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mf">0.0f</span><span class="p">;</span><span class="w"></span>
<span class="w">      </span><span class="n">c</span><span class="w"> </span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
<span class="w">    </span><span class="p">}</span><span class="w"></span>
<span class="w">  </span><span class="p">}).</span><span class="n">name</span><span class="p">(</span><span class="s">&quot;clean_up&quot;</span><span class="p">);</span><span class="w"></span>

<span class="w">  </span><span class="c1">// update cluster</span>
<span class="w">  </span><span class="n">tf</span><span class="o">::</span><span class="n">Task</span><span class="w"> </span><span class="n">pf</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">taskflow</span><span class="p">.</span><span class="n">for_each_index</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="w"> </span><span class="n">N</span><span class="p">,</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="p">[</span><span class="o">&amp;</span><span class="p">](</span><span class="kt">int</span><span class="w"> </span><span class="n">i</span><span class="p">){</span><span class="w"></span>
<span class="w">    </span><span class="kt">float</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">px</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w"></span>
<span class="w">    </span><span class="kt">float</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">py</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w"></span>
<span class="w">    </span><span class="kt">float</span><span class="w"> </span><span class="n">best_d</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">numeric_limits</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;::</span><span class="n">max</span><span class="p">();</span><span class="w"></span>
<span class="w">    </span><span class="kt">int</span><span class="w"> </span><span class="n">best_k</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"></span>
<span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="n">K</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">k</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">      </span><span class="k">const</span><span class="w"> </span><span class="kt">float</span><span class="w"> </span><span class="n">d</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">L2</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">,</span><span class="w"> </span><span class="n">mx</span><span class="p">[</span><span class="n">k</span><span class="p">],</span><span class="w"> </span><span class="n">my</span><span class="p">[</span><span class="n">k</span><span class="p">]);</span><span class="w"></span>
<span class="w">      </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="n">d</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="n">best_d</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">        </span><span class="n">best_d</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">d</span><span class="p">;</span><span class="w"></span>
<span class="w">        </span><span class="n">best_k</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">k</span><span class="p">;</span><span class="w"></span>
<span class="w">      </span><span class="p">}</span><span class="w"></span>
<span class="w">    </span><span class="p">}</span><span class="w"></span>
<span class="w">    </span><span class="n">best_ks</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">best_k</span><span class="p">;</span><span class="w"></span>
<span class="w">  </span><span class="p">}).</span><span class="n">name</span><span class="p">(</span><span class="s">&quot;parallel-for&quot;</span><span class="p">);</span><span class="w"></span>

<span class="w">  </span><span class="n">tf</span><span class="o">::</span><span class="n">Task</span><span class="w"> </span><span class="n">update_cluster</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">taskflow</span><span class="p">.</span><span class="n">emplace</span><span class="p">([</span><span class="o">&amp;</span><span class="p">](){</span><span class="w"></span>
<span class="w">    </span><span class="k">for</span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">&lt;</span><span class="n">N</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">++</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">      </span><span class="n">sx</span><span class="p">[</span><span class="n">best_ks</span><span class="p">[</span><span class="n">i</span><span class="p">]]</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">px</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w"></span>
<span class="w">      </span><span class="n">sy</span><span class="p">[</span><span class="n">best_ks</span><span class="p">[</span><span class="n">i</span><span class="p">]]</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">py</span><span class="p">[</span><span class="n">i</span><span class="p">];</span><span class="w"></span>
<span class="w">      </span><span class="n">c</span><span class="w"> </span><span class="p">[</span><span class="n">best_ks</span><span class="p">[</span><span class="n">i</span><span class="p">]]</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"></span>
<span class="w">    </span><span class="p">}</span><span class="w"></span>

<span class="w">    </span><span class="k">for</span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">k</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">k</span><span class="o">&lt;</span><span class="n">K</span><span class="p">;</span><span class="w"> </span><span class="o">++</span><span class="n">k</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">      </span><span class="k">auto</span><span class="w"> </span><span class="n">count</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std</span><span class="o">::</span><span class="n">max</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="n">c</span><span class="p">[</span><span class="n">k</span><span class="p">]);</span><span class="w">  </span><span class="c1">// turn 0/0 to 0/1</span>
<span class="w">      </span><span class="n">mx</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sx</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">count</span><span class="p">;</span><span class="w"></span>
<span class="w">      </span><span class="n">my</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sy</span><span class="p">[</span><span class="n">k</span><span class="p">]</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">count</span><span class="p">;</span><span class="w"></span>
<span class="w">    </span><span class="p">}</span><span class="w"></span>
<span class="w">  </span><span class="p">}).</span><span class="n">name</span><span class="p">(</span><span class="s">&quot;update_cluster&quot;</span><span class="p">);</span><span class="w"></span>
<span class="w">  </span>
<span class="w">  </span><span class="c1">// convergence check</span>
<span class="w">  </span><span class="n">tf</span><span class="o">::</span><span class="n">Task</span><span class="w"> </span><span class="n">condition</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">taskflow</span><span class="p">.</span><span class="n">emplace</span><span class="p">([</span><span class="n">m</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span><span class="w"> </span><span class="n">M</span><span class="p">]()</span><span class="w"> </span><span class="k">mutable</span><span class="w"> </span><span class="p">{</span><span class="w"></span>
<span class="w">    </span><span class="k">return</span><span class="w"> </span><span class="p">(</span><span class="n">m</span><span class="o">++</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="n">M</span><span class="p">)</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="o">:</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"></span>
<span class="w">  </span><span class="p">}).</span><span class="n">name</span><span class="p">(</span><span class="s">&quot;converged?&quot;</span><span class="p">);</span><span class="w"></span>

<span class="w">  </span><span class="n">init</span><span class="p">.</span><span class="n">precede</span><span class="p">(</span><span class="n">clean_up</span><span class="p">);</span><span class="w"></span>

<span class="w">  </span><span class="n">clean_up</span><span class="p">.</span><span class="n">precede</span><span class="p">(</span><span class="n">pf</span><span class="p">);</span><span class="w"></span>
<span class="w">  </span><span class="n">pf</span><span class="p">.</span><span class="n">precede</span><span class="p">(</span><span class="n">update_cluster</span><span class="p">);</span><span class="w"></span>

<span class="w">  </span><span class="n">condition</span><span class="p">.</span><span class="n">precede</span><span class="p">(</span><span class="n">clean_up</span><span class="p">)</span><span class="w"></span>
<span class="w">           </span><span class="p">.</span><span class="n">succeed</span><span class="p">(</span><span class="n">update_cluster</span><span class="p">);</span><span class="w"></span>

<span class="w">  </span><span class="n">executor</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">taskflow</span><span class="p">).</span><span class="n">wait</span><span class="p">();</span><span class="w"></span>
<span class="p">}</span><span class="w"></span></pre><p>The taskflow consists of two parts, a <code>clean_up</code> task and a parallel-for graph. The former resets the storage <code>sx</code>, <code>sy</code>, and <code>c</code> used to average points into new centroids, and the latter parallelizes the search for the nearest centroid across individual points using 12 tasks (this number may vary depending on the machine). If the iteration count is smaller than <code>M</code>, the condition task returns 0 to direct execution back to <code>clean_up</code>. Otherwise, it returns 1 to stop, because the condition task has no successor at index 1. The taskflow graph is illustrated below:</p><div class="m-graph"><svg style="width: 61.700rem; height: 74.300rem;" viewBox="0.00 0.00 617.42 743.00">
<g transform="scale(1 1) rotate(0) translate(4 739)">
<title>Taskflow</title>
<g class="m-cluster">
<title>cluster_p0x1dcb6e0</title>
<polygon points="8,-8 8,-727 203.46,-727 203.46,-8 8,-8"/>
<text text-anchor="middle" x="105.73" y="-715" font-family="Helvetica,sans-Serif" font-size="10.00">Subflow: parallel&#45;for</text>
</g>
<g class="m-node m-flat">
<title>p0x1dcb4c0</title>
<ellipse cx="435.01" cy="-421" rx="27" ry="18"/>
<text text-anchor="middle" x="435.01" y="-418.5" font-family="Helvetica,sans-Serif" font-size="10.00">init</text>
</g>
<g class="m-node m-flat">
<title>p0x1dcb5d0</title>
<ellipse cx="574.1" cy="-375" rx="35.14" ry="18"/>
<text text-anchor="middle" x="574.1" y="-372.5" font-family="Helvetica,sans-Serif" font-size="10.00">clean_up</text>
</g>
<g class="m-edge">
<title>p0x1dcb4c0&#45;&gt;p0x1dcb5d0</title>
<path d="M459.58,-413.09C480.17,-406.18 510.45,-396.02 534.56,-387.93"/>
<polygon points="535.77,-391.22 544.13,-384.72 533.54,-384.58 535.77,-391.22"/>
</g>
<g class="m-node m-flat">
<title>p0x1dcb6e0</title>
<ellipse cx="153.66" cy="-358" rx="41.59" ry="18"/>
<text text-anchor="middle" x="153.66" y="-355.5" font-family="Helvetica,sans-Serif" font-size="10.00">parallel&#45;for</text>
</g>
<g class="m-edge">
<title>p0x1dcb5d0&#45;&gt;p0x1dcb6e0</title>
<path d="M538.62,-373.6C464.97,-370.61 292.03,-363.58 205.64,-360.07"/>
<polygon points="205.68,-356.57 195.55,-359.66 205.39,-363.56 205.68,-356.57"/>
</g>
<g class="m-node m-flat">
<title>p0x1dcb7f0</title>
<ellipse cx="284.85" cy="-329" rx="52.28" ry="18"/>
<text text-anchor="middle" x="284.85" y="-326.5" font-family="Helvetica,sans-Serif" font-size="10.00">update_cluster</text>
</g>
<g class="m-edge">
<title>p0x1dcb6e0&#45;&gt;p0x1dcb7f0</title>
<path d="M190.99,-349.84C203.24,-347.09 217.18,-343.97 230.47,-340.98"/>
<polygon points="231.47,-344.34 240.46,-338.74 229.94,-337.51 231.47,-344.34"/>
</g>
<g class="m-node">
<title>p0x1dcb900</title>
<polygon points="435.01,-347 374.47,-329 435.01,-311 495.56,-329 435.01,-347"/>
<text text-anchor="middle" x="435.01" y="-326.5" font-family="Helvetica,sans-Serif" font-size="10.00">converged?</text>
</g>
<g class="m-edge">
<title>p0x1dcb7f0&#45;&gt;p0x1dcb900</title>
<path d="M337.25,-329C345.94,-329 355.08,-329 364.14,-329"/>
<polygon points="364.3,-332.5 374.3,-329 364.3,-325.5 364.3,-332.5"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd610000b50</title>
<ellipse cx="45.43" cy="-682" rx="27" ry="18"/>
<text text-anchor="middle" x="45.43" y="-679.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_0</text>
</g>
<g class="m-edge">
<title>p0x7fd610000b50&#45;&gt;p0x1dcb6e0</title>
<path d="M63.75,-668.48C67.9,-664.52 71.96,-659.92 74.87,-655 127.49,-565.95 145.03,-441.24 150.44,-386.1"/>
<polygon points="153.94,-386.32 151.37,-376.04 146.97,-385.67 153.94,-386.32"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd610000c60</title>
<ellipse cx="45.43" cy="-628" rx="27" ry="18"/>
<text text-anchor="middle" x="45.43" y="-625.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_1</text>
</g>
<g class="m-edge">
<title>p0x7fd610000c60&#45;&gt;p0x1dcb6e0</title>
<path d="M63.57,-614.37C67.72,-610.41 71.83,-605.84 74.87,-601 118.89,-530.86 140.1,-433.58 148.34,-386.2"/>
<polygon points="151.84,-386.51 150.04,-376.07 144.94,-385.36 151.84,-386.51"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd610000d70</title>
<ellipse cx="45.43" cy="-574" rx="27" ry="18"/>
<text text-anchor="middle" x="45.43" y="-571.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_2</text>
</g>
<g class="m-edge">
<title>p0x7fd610000d70&#45;&gt;p0x1dcb6e0</title>
<path d="M63.31,-560.2C67.46,-556.24 71.63,-551.72 74.87,-547 110.44,-495.1 134.06,-424.41 145.21,-385.87"/>
<polygon points="148.68,-386.47 148.03,-375.9 141.95,-384.57 148.68,-386.47"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd610000e80</title>
<ellipse cx="45.43" cy="-520" rx="27" ry="18"/>
<text text-anchor="middle" x="45.43" y="-517.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_3</text>
</g>
<g class="m-edge">
<title>p0x7fd610000e80&#45;&gt;p0x1dcb6e0</title>
<path d="M62.9,-505.9C67.05,-501.94 71.33,-497.49 74.87,-493 102.11,-458.4 126.17,-413.21 140.07,-384.89"/>
<polygon points="143.35,-386.15 144.55,-375.62 137.05,-383.1 143.35,-386.15"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd610000f90</title>
<ellipse cx="45.43" cy="-466" rx="27" ry="18"/>
<text text-anchor="middle" x="45.43" y="-463.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_4</text>
</g>
<g class="m-edge">
<title>p0x7fd610000f90&#45;&gt;p0x1dcb6e0</title>
<path d="M61.9,-451.55C66.14,-447.53 70.71,-443.13 74.87,-439 93.76,-420.21 114.65,-398.41 130.02,-382.15"/>
<polygon points="132.77,-384.34 137.09,-374.66 127.68,-379.53 132.77,-384.34"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd6100010a0</title>
<ellipse cx="45.43" cy="-412" rx="27" ry="18"/>
<text text-anchor="middle" x="45.43" y="-409.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_5</text>
</g>
<g class="m-edge">
<title>p0x7fd6100010a0&#45;&gt;p0x1dcb6e0</title>
<path d="M67.42,-401.33C81.47,-394.19 100.33,-384.6 116.73,-376.26"/>
<polygon points="118.73,-379.18 126.05,-371.53 115.55,-372.94 118.73,-379.18"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd6100011b0</title>
<ellipse cx="45.43" cy="-358" rx="27" ry="18"/>
<text text-anchor="middle" x="45.43" y="-355.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_6</text>
</g>
<g class="m-edge">
<title>p0x7fd6100011b0&#45;&gt;p0x1dcb6e0</title>
<path d="M72.69,-358C81.45,-358 91.52,-358 101.51,-358"/>
<polygon points="101.82,-361.5 111.82,-358 101.82,-354.5 101.82,-361.5"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd6100012c0</title>
<ellipse cx="45.43" cy="-304" rx="27" ry="18"/>
<text text-anchor="middle" x="45.43" y="-301.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_7</text>
</g>
<g class="m-edge">
<title>p0x7fd6100012c0&#45;&gt;p0x1dcb6e0</title>
<path d="M67.42,-314.67C81.47,-321.81 100.33,-331.4 116.73,-339.74"/>
<polygon points="115.55,-343.06 126.05,-344.47 118.73,-336.82 115.55,-343.06"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd6100013d0</title>
<ellipse cx="45.43" cy="-250" rx="27" ry="18"/>
<text text-anchor="middle" x="45.43" y="-247.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_8</text>
</g>
<g class="m-edge">
<title>p0x7fd6100013d0&#45;&gt;p0x1dcb6e0</title>
<path d="M61.9,-264.45C66.14,-268.47 70.71,-272.87 74.87,-277 93.76,-295.79 114.65,-317.59 130.02,-333.85"/>
<polygon points="127.68,-336.47 137.09,-341.34 132.77,-331.66 127.68,-336.47"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd6100014e0</title>
<ellipse cx="45.43" cy="-196" rx="27" ry="18"/>
<text text-anchor="middle" x="45.43" y="-193.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_9</text>
</g>
<g class="m-edge">
<title>p0x7fd6100014e0&#45;&gt;p0x1dcb6e0</title>
<path d="M62.9,-210.1C67.05,-214.06 71.33,-218.51 74.87,-223 102.11,-257.6 126.17,-302.79 140.07,-331.11"/>
<polygon points="137.05,-332.9 144.55,-340.38 143.35,-329.85 137.05,-332.9"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd6100015f0</title>
<ellipse cx="45.43" cy="-142" rx="29.37" ry="18"/>
<text text-anchor="middle" x="45.43" y="-139.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_10</text>
</g>
<g class="m-edge">
<title>p0x7fd6100015f0&#45;&gt;p0x1dcb6e0</title>
<path d="M63.91,-156.37C67.86,-160.19 71.79,-164.51 74.87,-169 110.44,-220.9 134.06,-291.59 145.21,-330.13"/>
<polygon points="141.95,-331.43 148.03,-340.1 148.68,-329.53 141.95,-331.43"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd610001700</title>
<ellipse cx="45.43" cy="-88" rx="29.37" ry="18"/>
<text text-anchor="middle" x="45.43" y="-85.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_11</text>
</g>
<g class="m-edge">
<title>p0x7fd610001700&#45;&gt;p0x1dcb6e0</title>
<path d="M64.17,-102.2C68.11,-106.02 71.97,-110.39 74.87,-115 118.89,-185.14 140.1,-282.42 148.34,-329.8"/>
<polygon points="144.94,-330.64 150.04,-339.93 151.84,-329.49 144.94,-330.64"/>
</g>
<g class="m-node m-flat">
<title>p0x7fd610001810</title>
<ellipse cx="45.43" cy="-34" rx="29.37" ry="18"/>
<text text-anchor="middle" x="45.43" y="-31.5" font-family="Helvetica,sans-Serif" font-size="10.00">pfg_12</text>
</g>
<g class="m-edge">
<title>p0x7fd610001810&#45;&gt;p0x1dcb6e0</title>
<path d="M64.34,-48.09C68.28,-51.92 72.1,-56.31 74.87,-61 127.49,-150.05 145.03,-274.76 150.44,-329.9"/>
<polygon points="146.97,-330.33 151.37,-339.96 153.94,-329.68 146.97,-330.33"/>
</g>
<g class="m-edge">
<title>p0x1dcb900&#45;&gt;p0x1dcb5d0</title>
<path stroke-dasharray="5,2" d="M471.96,-336.21C487.22,-339.7 505.08,-344.37 520.78,-350 527,-352.23 533.47,-354.96 539.65,-357.79"/>
<polygon points="538.52,-361.12 549.05,-362.25 541.52,-354.8 538.52,-361.12"/>
<text text-anchor="middle" x="517.28" y="-353" font-family="Helvetica,sans-Serif" font-size="10.00">0</text>
</g>
</g>
</svg>
</div><p>The scheduler starts with <code>init</code>, moves on to <code>clean_up</code>, and then enters the parallel-for task <code>parallel-for</code>, which spawns a subflow of 12 workers to perform parallel iterations. When <code>parallel-for</code> completes, the scheduler runs <code>update_cluster</code> to update the cluster centroids and then checks for convergence through a condition task. If the centroids have not converged, the condition task tells the scheduler to go back to <code>clean_up</code> and then <code>parallel-for</code>; otherwise, it returns an index that has no successor task, which stops the scheduler.</p></section><section id="KMeansBenchmarking"><h2><a href="#KMeansBenchmarking">Benchmarking</a></h2><p>Based on the discussion above, we compare the runtime of the sequential and parallel CPU implementations across various k-means problem sizes on a machine with 12 Intel i7-8700 CPUs at 3.2 GHz.</p><table class="m-table"><thead><tr><th>N</th><th>K</th><th>M</th><th>CPU Sequential</th><th>CPU Parallel</th></tr></thead><tbody><tr><td>10</td><td>5</td><td>10</td><td>0.14 ms</td><td>77 ms</td></tr><tr><td>100</td><td>10</td><td>100</td><td>0.56 ms</td><td>86 ms</td></tr><tr><td>1000</td><td>10</td><td>1000</td><td>10 ms</td><td>98 ms</td></tr><tr><td>10000</td><td>10</td><td>10000</td><td>1006 ms</td><td>713 ms</td></tr><tr><td>100000</td><td>10</td><td>100000</td><td>102483 ms</td><td>49966 ms</td></tr></tbody></table><p>For small problems the parallel version is slower than the sequential one (77 ms versus 0.14 ms at N = 10). Once the number of points reaches 10K, the parallel CPU implementation starts to outperform the sequential CPU implementation (713 ms versus 1006 ms).</p></section>
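<p>To make a comparison like the one in the table above reproducible, the following is a minimal sketch of a sequential baseline together with a timing helper. It is not the code used to produce the reported numbers. The names <code>kmeans_seq</code> and <code>elapsed_ms</code> are hypothetical, <code>L2</code> is assumed here to be the squared Euclidean distance, and the sketch runs exactly <code>M</code> update iterations using only the C++ standard library.</p>

```cpp
#include <algorithm>
#include <chrono>
#include <limits>
#include <utility>
#include <vector>

// Assumption: L2 is the squared Euclidean distance between (x1, y1) and (x2, y2).
inline float L2(float x1, float y1, float x2, float y2) {
  return (x1 - x2) * (x1 - x2) + (y1 - y2) * (y1 - y2);
}

// Hypothetical sequential baseline. Runs M iterations and returns the K
// centroids as a pair (mx, my). Assumes px.size() == py.size() == N and
// 0 < K <= N.
std::pair<std::vector<float>, std::vector<float>> kmeans_seq(
  int N, int K, int M, const std::vector<float>& px, const std::vector<float>& py
) {
  std::vector<int> c(K);
  std::vector<float> sx(K), sy(K), mx(K), my(K);

  // initial centroids: the first K points
  for (int k = 0; k < K; ++k) {
    mx[k] = px[k];
    my[k] = py[k];
  }

  for (int m = 0; m < M; ++m) {
    // clear the storage
    std::fill(sx.begin(), sx.end(), 0.0f);
    std::fill(sy.begin(), sy.end(), 0.0f);
    std::fill(c.begin(), c.end(), 0);

    // assign each point to its nearest centroid and accumulate the sums
    for (int i = 0; i < N; ++i) {
      float best_d = std::numeric_limits<float>::max();
      int best_k = 0;
      for (int k = 0; k < K; ++k) {
        const float d = L2(px[i], py[i], mx[k], my[k]);
        if (d < best_d) {
          best_d = d;
          best_k = k;
        }
      }
      sx[best_k] += px[i];
      sy[best_k] += py[i];
      c[best_k] += 1;
    }

    // move each centroid to the mean of its assigned points
    for (int k = 0; k < K; ++k) {
      const int count = std::max(1, c[k]);  // turn 0/0 to 0/1
      mx[k] = sx[k] / count;
      my[k] = sy[k] / count;
    }
  }
  return {mx, my};
}

// Hypothetical helper: wall-clock time of a single call to f, in milliseconds.
template <typename F>
double elapsed_ms(F&& f) {
  const auto beg = std::chrono::steady_clock::now();
  f();
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(end - beg).count();
}
```

<p>Wrapping each implementation in the same helper, for example <code>elapsed_ms([&amp;]{ kmeans_seq(N, K, M, px, py); })</code>, measures both with the same monotonic clock. Timings from a single run can be noisy, so averaging over several runs is a reasonable refinement.</p>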
      </div>
    </div>
  </div>
</article></main>
<div class="m-doc-search" id="search">
  <a href="#!" onclick="return hideSearch()"></a>
  <div class="m-container">
    <div class="m-row">
      <div class="m-col-m-8 m-push-m-2">
        <div class="m-doc-search-header m-text m-small">
          <div><span class="m-label m-default">Tab</span> / <span class="m-label m-default">T</span> to search, <span class="m-label m-default">Esc</span> to close</div>
          <div id="search-symbolcount">&hellip;</div>
        </div>
        <div class="m-doc-search-content">
          <form>
            <input type="search" name="q" id="search-input" placeholder="Loading &hellip;" disabled="disabled" autofocus="autofocus" autocomplete="off" spellcheck="false" />
          </form>
          <noscript class="m-text m-danger m-text-center">Unlike everything else in the docs, the search functionality <em>requires</em> JavaScript.</noscript>
          <div id="search-help" class="m-text m-dim m-text-center">
            <p class="m-noindent">Search for symbols, directories, files, pages or
            modules. You can omit any prefix from the symbol or file path; adding a
            <code>:</code> or <code>/</code> suffix lists all members of given symbol or
            directory.</p>
            <p class="m-noindent">Use <span class="m-label m-dim">&darr;</span>
            / <span class="m-label m-dim">&uarr;</span> to navigate through the list,
            <span class="m-label m-dim">Enter</span> to go.
            <span class="m-label m-dim">Tab</span> autocompletes common prefix, you can
            copy a link to the result using <span class="m-label m-dim">⌘</span>
            <span class="m-label m-dim">L</span> while <span class="m-label m-dim">⌘</span>
            <span class="m-label m-dim">M</span> produces a Markdown link.</p>
          </div>
          <div id="search-notfound" class="m-text m-warning m-text-center">Sorry, nothing was found.</div>
          <ul id="search-results"></ul>
        </div>
      </div>
    </div>
  </div>
</div>
<script src="search-v2.js"></script>
<script src="searchdata-v2.js" async="async"></script>
<footer><nav>
  <div class="m-container">
    <div class="m-row">
      <div class="m-col-l-10 m-push-l-1">
        <p>Taskflow handbook is part of the <a href="https://taskflow.github.io">Taskflow project</a>, copyright © <a href="https://tsung-wei-huang.github.io/">Dr. Tsung-Wei Huang</a>, 2018&ndash;2023.<br />Generated by <a href="https://doxygen.org/">Doxygen</a> 1.9.1 and <a href="https://mcss.mosra.cz/">m.css</a>.</p>
      </div>
    </div>
  </div>
</nav></footer>
</body>
</html>
